Pii: S0893-6080(97)00147-0
Author
Abstract
The relation between hard c-means (HCM), fuzzy c-means (FCM), fuzzy learning vector quantization (FLVQ), the soft competition scheme (SCS) of Yair et al. (1992), and probabilistic Gaussian mixtures (GM) has been pointed out recently by Bezdek and Pal (1995). We extend this relation to their training, showing that the learning rules these models use to estimate the cluster centers can be seen as approximations to the expectation–maximization (EM) method as applied to Gaussian mixtures. HCM and unsupervised LVQ use 1-of-c type competition. In FCM and FLVQ, membership is the −2/(m − 1)th power of the distance. In SCS and GM, a Gaussian membership function is used. If the Gaussian membership function is used, the weighted within-groups sum of squared errors used as the fuzzy objective function corresponds to the maximum likelihood estimate in Gaussian mixtures with equal priors and covariances. The fuzzy clustering method named the fuzzy c-means alternating optimization procedure (FCM-AO), proposed to optimize the former, is then equivalent to batch EM, and SCS's update rule is a variant of the online version of EM. The advantages of the probabilistic framework are: (i) we no longer have spurious spread parameters that need fine tuning, such as m in fuzzy vector quantization or β in SCS; instead we have a variance term that has a sound interpretation and that can be estimated from the sample; (ii) EM guarantees that the likelihood does not decrease, thus it converges to the nearest local optimum; (iii) EM also allows us to estimate the underlying distance norm and the cluster priors, which we could not with the other approaches. We compare Gaussian mixtures trained with EM against LVQ (HCM), SCS, and FLVQ on the IRIS dataset and see that the mixture model is more accurate because it can take the covariance information into account. We finally note that vector quantization is generally an intermediate step before finding a final output, for which supervision may be possible. Thus, instead of an uncoupled approach where an unsupervised method is first used to find the cluster parameters, followed by supervised training of the mapping based on the memberships, we advocate a coupled approach where the cluster parameters and the mapping are trained together in a supervised manner. The uncoupled approach ignores the error at the outputs, which may not be ideal. © 1998 Elsevier Science Ltd. All rights reserved.
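To make the correspondence concrete, the following minimal Python sketch (illustrative, not the paper's code; all function names are assumptions) implements the three membership rules named above and one batch EM step for a Gaussian mixture with equal priors and a shared spherical variance, where the E-step reduces to the Gaussian membership and the M-step to a membership-weighted mean, as in FCM-AO.

import numpy as np

def sq_dists(X, centers):
    """Squared Euclidean distances; X is (n, d), centers is (c, d), result is (n, c)."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def memberships_hard(d2):
    """HCM / unsupervised LVQ: 1-of-c competition (winner takes all)."""
    u = np.zeros_like(d2)
    u[np.arange(len(d2)), d2.argmin(axis=1)] = 1.0
    return u

def memberships_fcm(d2, m=2.0):
    """FCM / FLVQ: membership proportional to the -2/(m-1)th power of distance."""
    u = (d2 + 1e-12) ** (-1.0 / (m - 1.0))  # d^(-2/(m-1)), since d2 = d^2
    return u / u.sum(axis=1, keepdims=True)

def memberships_gaussian(d2, var=1.0):
    """SCS / GM: Gaussian membership, i.e. the EM posterior with equal priors and covariances."""
    u = np.exp(-0.5 * d2 / var)
    return u / u.sum(axis=1, keepdims=True)

def em_step(X, centers, var):
    """One batch EM step for a spherical Gaussian mixture with equal priors and shared variance."""
    u = memberships_gaussian(sq_dists(X, centers), var)             # E-step
    centers = (u.T @ X) / u.sum(axis=0)[:, None]                    # M-step: weighted means
    var = (u * sq_dists(X, centers)).sum() / (X.shape[1] * len(X))  # M-step: shared variance
    return centers, var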
Similar Resources
Regularization with a Pruning Prior
We investigate the use of a regularization prior and its pruning properties. We illustrate the behavior of this prior by conducting analyses both within a Bayesian framework and with the generalization method, on a simple toy problem. Results are thoroughly compared with those obtained with a traditional weight decay. Copyright 1997 Elsevier Science Ltd.
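As a generic illustration of the contrast drawn in this abstract (a stand-in, not the paper's specific prior), a Gaussian prior on the weights yields the familiar weight-decay gradient step, while a sparsity-inducing Laplace prior, applied via a proximal step, prunes small weights exactly to zero:

import numpy as np

def step_weight_decay(w, grad_loss, lr=0.1, lam=0.01):
    """Gaussian prior: all weights shrink multiplicatively; none become exactly zero."""
    return w - lr * (grad_loss + lam * w)

def step_laplace_prior(w, grad_loss, lr=0.1, lam=0.01):
    """Laplace prior via soft-thresholding: weights below lr*lam are pruned to zero."""
    w = w - lr * grad_loss
    return np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)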
Pii: S0893-6080(97)00012-9
Kohonen’s learning vector quantization (LVQ) is modified by attributing training counters to each neuron, which record its training statistics. During training, this allows for dynamic self-allocation of the neurons to classes. In the classification stage, training counters provide an estimate of the reliability of classification of the single neurons, which can be exploited to obtain a substantially higher purity of classifica...
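A sketch reconstructing the idea as described in this (truncated) abstract, with all names hypothetical rather than the authors' exact algorithm: each codebook neuron keeps per-class training counters, the winner's counter for the sample's class is incremented so that neurons are dynamically allocated to the class they respond to most, and the counter ratio serves as a reliability estimate.

import numpy as np

class CountingLVQ:
    def __init__(self, prototypes, n_classes, lr=0.05):
        self.w = prototypes.copy()                    # (c, d) codebook vectors
        self.counts = np.zeros((len(prototypes), n_classes))
        self.lr = lr

    def train_step(self, x, y):
        k = ((self.w - x) ** 2).sum(axis=1).argmin()  # winning neuron
        self.counts[k, y] += 1                        # record training statistics
        label = self.counts[k].argmax()               # dynamic self-allocation to a class
        sign = 1.0 if label == y else -1.0            # LVQ1-style attract/repel update
        self.w[k] += sign * self.lr * (x - self.w[k])

    def reliability(self, k):
        """Fraction of neuron k's training hits that came from its allocated class."""
        total = self.counts[k].sum()
        return self.counts[k].max() / total if total else 0.0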
Precision Requirements for Closed-Loop Kinematic Robotic Control Using Linear Local Mappings
Neural networks are approximation techniques that can be characterized by adaptability rather than by precision. For feedback systems, high precision can still be achieved in the presence of errors. Within a general iterative framework of closed-loop kinematic robotic control using linear local modeling, the inverse Jacobian matrix error and the maximum length of the displacement for which the line...
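A minimal sketch of the general scheme this abstract describes: closed-loop kinematic control in which an approximate inverse Jacobian (any linear local mapping, e.g. one provided by a network) is applied iteratively, so that feedback absorbs the approximation error. The callables fk (forward kinematics) and J_inv_approx are assumptions for this sketch.

import numpy as np

def closed_loop_ik(q, x_target, fk, J_inv_approx, gain=0.5, tol=1e-4, max_iter=100):
    for _ in range(max_iter):
        e = x_target - fk(q)                  # task-space error
        if np.linalg.norm(e) < tol:           # high precision despite mapping error
            break
        q = q + gain * (J_inv_approx(q) @ e)  # linear local correction step
    return q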
Estimates of the Number of Hidden Units and Variation with Respect to Half-Spaces
We estimate variation with respect to half-spaces in terms of "flows through hyperplanes". Our estimate is derived from an integral representation for smooth compactly supported multivariable functions, proved using properties of the Heaviside and delta distributions. Consequently, we obtain conditions which guarantee an approximation error rate of order O(1/√n) by one-hidden-layer networks with n sigmoid...
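For orientation, a standard bound of this (Barron/Kůrková) type, stated here as background rather than quoted from the truncated abstract: with V(f) the variation of f with respect to half-spaces and f_n a one-hidden-layer network with n Heaviside or sigmoidal units,

\[
\| f - f_n \|_{L_2} \;=\; O\!\left( \frac{V(f)}{\sqrt{n}} \right).
\]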
On the Storage Capacity of Nonlinear Neural Networks
We consider the Hopfield associative memory for storing m patterns ξ^(r) ∈ {−1, +1}^n, r = 1, …, m. The weights are given by the scalar product model w_ij = (m/n)G, i ≠ j, w_ii ≡ 0, where G: R → R is some nonlinear function, like G(x) ≡ sgn(x), which is used in hardware implementations of associative memories. We give a rigorous lower bound for the memory s...
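A minimal sketch of the clipped (sign) Hebb rule this abstract alludes to, assuming the common hardware-motivated choice G(x) = sgn(x); this illustrates the model class, not the paper's exact normalization.

import numpy as np

def clipped_hebb_weights(patterns):
    """patterns: (m, n) array of m patterns with entries in {-1, +1}."""
    W = np.sign(patterns.T @ patterns).astype(float)  # G = sgn applied to the scalar products
    np.fill_diagonal(W, 0.0)                          # w_ii = 0
    return W

def recall(W, x, n_steps=20):
    """Synchronous sign-updates of the Hopfield dynamics until a fixed point."""
    for _ in range(n_steps):
        x_new = np.sign(W @ x)
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x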
متن کاملPii: S0893-6080(98)00147-6
Neural networks (NN) are used in this paper to tune PI controllers for unknown plants, which may be nonlinear or open-loop unstable. A simple algorithm, which requires only knowledge of the plant output response direction, is used for training an NN controller, by employing the error between the reference and the plant output. Once this controller achieves good performance, its input–output beh...
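A schematic sketch of the setup this abstract describes (the update rule and names are illustrative assumptions, not the authors' algorithm): a discrete PI controller whose gains are nudged using only the tracking error and the known sign of the plant's output response direction, in the spirit of MIT-rule adaptation.

def pi_tuning_episode(plant_step, ref, kp, ki, plant_sign=1.0, lr=1e-3, y0=0.0):
    """plant_step(y, u) -> next output; only sign(dy/du) is assumed known."""
    y, integ = y0, 0.0
    for r in ref:
        e = r - y                # error between reference and plant output
        integ += e
        u = kp * e + ki * integ  # PI control law
        y = plant_step(y, u)
        # sign-of-sensitivity update: adjust gains along the known response direction
        kp += lr * plant_sign * e * e
        ki += lr * plant_sign * e * integ
    return kp, ki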